Re: [SQL] soundex more or less exact - Mailing list pgsql-sql

From Herouth Maoz
Subject Re: [SQL] soundex more or less exact
Date
Msg-id l03110700b1995dbfd3f6@[147.233.159.109]
In response to soundex more or less exact  (guenther@laokoon.in-berlin.de (Christian Guenther))
List pgsql-sql
At 21:02 +0300 on 1/6/98, Christian Guenther wrote:


> How can I make the soundex-function more or less exact finding results.
>
> If I have a name 'Fretwurst' and search for 'Vretwurst' I do not get the
> 'Fretwurst' as result. Is it possible to get it right, because for me
> Vretwurst is near by Fretwurst?

Soundex is not a magic wand. It is a generic name for a function that maps
words that sound the same to one root. But "sound the same" in which
language? I once wrote a soundex function for Arabic. Believe me, it is
nothing like English soundex. It may be tempting to use English soundex for
German - after all, they use the same character set, more or less. But
since the pronunciation of "V", "F" and "W" is different, and I think
there's also some difference in "S", the way things *sound* is different.
The gap may not be as wide as the one between Arabic and English, but it's
enough to make the function unreliable for practical purposes.
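
To make the mismatch concrete, here is a minimal Python sketch of the classic American Soundex rules. It is an illustration only and not necessarily identical to the soundex shipped with PostgreSQL; the names `CODES` and `soundex` are my own. The point to notice is that the first letter is kept verbatim, so "F" and "V" can never meet, even though they fall in the same digit class everywhere else in the word.

```python
# Classic American Soundex, sketched for illustration (not PostgreSQL's source).
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in group:
        CODES[ch] = digit


def soundex(word):
    # Keep only plain A-Z letters; anything else is ignored in this sketch.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]                  # the first letter is kept as-is
    prev = CODES.get(letters[0], "")     # its class still suppresses repeats
    for ch in letters[1:]:
        code = CODES.get(ch, "")         # vowels, H, W and Y have no code
        if code and code != prev:
            result += code
        if ch not in "HW":               # H and W do not separate equal codes
            prev = code
    return (result + "000")[:4]          # pad or truncate to four characters


print(soundex("Fretwurst"), soundex("Vretwurst"))  # F636 V636
```

Both names get the same digits (636); only the retained initial letter differs, which is why the search for 'Vretwurst' does not find 'Fretwurst'.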

That's the theory. In practice, I believe the soundex in PostgreSQL is a
contributed module. So if you have programming skills, you may want to
read the sources, see how they work, and change the function to apply
German logic rather than English logic. Then you can contribute your own
version as "german_soundex" or whatever.
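
As a rough idea of what such a change could look like, here is a hypothetical Python sketch, not the contributed module's code. It is ordinary Soundex with a single German-flavoured tweak: an initial "V" is folded into "F", on the assumption that the two sound alike at the start of most native German words. The name `german_soundex` and the folding rule are my own assumptions; a real German version would need many more rules than this.

```python
# Hypothetical sketch: classic Soundex plus one German-specific assumption.
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in group:
        CODES[ch] = digit


def german_soundex(word):
    # Keep only plain A-Z letters; umlauts are simply dropped in this sketch.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    # Assumption: an initial V sounds like F in German, so fold it into F.
    if letters[0] == "V":
        letters[0] = "F"
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")         # vowels, H, W and Y have no code
        if code and code != prev:
            result += code
        if ch not in "HW":               # H and W do not separate equal codes
            prev = code
    return (result + "000")[:4]          # pad or truncate to four characters


# With the folding rule, both spellings map to the same key.
print(german_soundex("Fretwurst") == german_soundex("Vretwurst"))  # True
```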

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma


